Overview

Dataset statistics

Number of variables: 14
Number of observations: 786600
Missing cells: 24767
Missing cells (%): 0.2%
Duplicate rows: 546
Duplicate rows (%): 0.1%
Total size in memory: 90.0 MiB
Average record size in memory: 120.0 B
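
The percentages in this table follow from the counts. The sketch below re-derives them using only the figures reported above (the variable names are illustrative, not part of the report). Note that the missing-cell percentage is taken over all cells, that is, rows times variables, which is why it is much smaller than the 3.1% missing rate reported for the single column customer_order_rank.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 14
n_observations = 786_600
missing_cells = 24_767
duplicate_rows = 546
total_bytes = 90.0 * 1024**2  # 90.0 MiB as reported (already rounded)

# Missing cells are counted against every cell in the table.
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# Duplicate rows are counted against the number of rows.
duplicate_rows_pct = 100 * duplicate_rows / n_observations

# Average record size is total memory divided by the number of rows.
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_cells_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_rows_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```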

Variable types

NUM: 10
CAT: 2
BOOL: 2

Warnings

Dataset has 546 (0.1%) duplicate rows [Duplicates]
customer_id has a high cardinality: 245455 distinct values [High cardinality]
order_date has a high cardinality: 776 distinct values [High cardinality]
customer_order_rank has 24767 (3.1%) missing values [Missing]
voucher_amount is highly skewed (γ1 = 30.39394065) [Skewed]
platform_id is highly skewed (γ1 = -22.53663783) [Skewed]
voucher_amount has 743462 (94.5%) zeros [Zeros]
delivery_fee has 597536 (76.0%) zeros [Zeros]

Reproduction

Analysis started: 2020-10-13 20:27:58.649290
Analysis finished: 2020-10-13 20:29:15.885655
Duration: 1 minute and 17.24 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

customer_id
Categorical

HIGH CARDINALITY

Distinct: 245455
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Common values

Value                   Count     Frequency (%)
15edce943edd            386       < 0.1%
8745a335e9cf            288       < 0.1%
d956116d863d            286       < 0.1%
0063666607bb            273       < 0.1%
ae60dce05485            270       < 0.1%
a54a8e1579d4            254       < 0.1%
bebb751d49b8            253       < 0.1%
26ed6389a3aa            245       < 0.1%
ef6265f74aca            229       < 0.1%
a333fb175a0c            221       < 0.1%
Other values (245445)   783895    99.7%

[Figure: frequencies of value counts]

Unique

Unique: 145498
Unique (%): 18.5%

[Figure: histogram of lengths of the category]

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

order_date
Categorical

HIGH CARDINALITY

Distinct: 776
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Common values

Value                   Count     Frequency (%)
2017-01-01              4230      0.5%
2016-12-18              3395      0.4%
2017-02-26              3234      0.4%
2017-02-05              3218      0.4%
2017-02-12              3125      0.4%
2016-12-11              3100      0.4%
2016-12-04              3075      0.4%
2017-01-22              3005      0.4%
2017-01-29              3003      0.4%
2016-10-03              2999      0.4%
Other values (766)      754216    95.9%

[Figure: frequencies of value counts]

Unique

Unique: 41
Unique (%): < 0.1%

[Figure: histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

order_hour
Real number (ℝ≥0)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.58879608
Minimum: 0
Maximum: 23
Zeros: 4627
Zeros (%): 0.6%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 16
median: 18
Q3: 20
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.357192477
Coefficient of variation (CV): 0.1908710785
Kurtosis: 5.749711941
Mean: 17.58879608
Median Absolute Deviation (MAD): 2
Skewness: -1.749088644
Sum: 13835347
Variance: 11.27074133
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=24)]

Common values

Value                   Count     Frequency (%)
19                      134030    17.0%
18                      129654    16.5%
20                      108739    13.8%
17                      90782     11.5%
21                      68223     8.7%
16                      48877     6.2%
15                      34286     4.4%
22                      33403     4.2%
13                      31105     4.0%
14                      30323     3.9%
Other values (14)       77178     9.8%

Minimum 5 values

Value                   Count     Frequency (%)
0                       4627      0.6%
1                       2425      0.3%
2                       1187      0.2%
3                       443       0.1%
4                       137       < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
23                      13832     1.8%
22                      33403     4.2%
21                      68223     8.7%
20                      108739    13.8%
19                      134030    17.0%
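
Several of the order_hour statistics are simple functions of the others. As a rough cross-check, the sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the reported figures; the variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Figures copied from the order_hour statistics in this report.
mean = 17.58879608
std = 3.357192477
q1, q3 = 16, 20
minimum, maximum = 0, 23
total = 13835347           # "Sum"
n_observations = 786600    # order_hour has no missing values

value_range = maximum - minimum        # Range
iqr = q3 - q1                          # Interquartile range (IQR)
cv = std / mean                        # Coefficient of variation (CV)
variance = std ** 2                    # Variance
mean_from_sum = total / n_observations # Mean, recomputed from the sum

print(value_range, iqr)
print(round(cv, 6), round(variance, 5), round(mean_from_sum, 6))
```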
 

customer_order_rank
Real number (ℝ≥0)

MISSING

Distinct: 369
Distinct (%): < 0.1%
Missing: 24767
Missing (%): 3.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.436809642
Minimum: 1
Maximum: 369
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 3
Q3: 10
95-th percentile: 39
Maximum: 369
Range: 368
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 17.77232218
Coefficient of variation (CV): 1.88329773
Kurtosis: 49.04720204
Mean: 9.436809642
Median Absolute Deviation (MAD): 2
Skewness: 5.494014541
Sum: 7189273
Variance: 315.8554356
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
1                       244937    31.1%
2                       96641     12.3%
3                       60532     7.7%
4                       43681     5.6%
5                       34036     4.3%
6                       27603     3.5%
7                       23049     2.9%
8                       19696     2.5%
9                       17013     2.2%
10                      14889     1.9%
Other values (359)      179756    22.9%
(Missing)               24767     3.1%

Minimum 5 values

Value                   Count     Frequency (%)
1                       244937    31.1%
2                       96641     12.3%
3                       60532     7.7%
4                       43681     5.6%
5                       34036     4.3%

Maximum 5 values

Value                   Count     Frequency (%)
369                     1         < 0.1%
368                     1         < 0.1%
367                     1         < 0.1%
366                     1         < 0.1%
365                     1         < 0.1%
 

is_failed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                   Count     Frequency (%)
0                       761833    96.9%
1                       24767     3.1%

voucher_amount
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 911
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09148909292
Minimum: 0
Maximum: 93.3989
Zeros: 743462
Zeros (%): 94.5%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.686
Maximum: 93.3989
Range: 93.3989
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4795579176
Coefficient of variation (CV): 5.241694963
Kurtosis: 3886.352852
Mean: 0.09148909292
Median Absolute Deviation (MAD): 0
Skewness: 30.39394065
Sum: 71965.32049
Variance: 0.2299757963
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
0                       743462    94.5%
1.029                   11647     1.5%
1.715                   11134     1.4%
2.058                   9122      1.2%
0.686                   3648      0.5%
1.372                   1770      0.2%
2.744                   1192      0.2%
2.5725                  897       0.1%
3.43                    543       0.1%
0.5145                  373       < 0.1%
Other values (901)      2812      0.4%

Minimum 5 values

Value                   Count     Frequency (%)
0                       743462    94.5%
0.00343                 35        < 0.1%
0.28469                 1         < 0.1%
0.32242                 1         < 0.1%
0.343                   19        < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
93.3989                 1         < 0.1%
78.02907                1         < 0.1%
68.3942                 1         < 0.1%
61.82575                1         < 0.1%
37.57565                1         < 0.1%
 

delivery_fee
Real number (ℝ≥0)

ZEROS

Distinct: 98
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1811799318
Minimum: 0
Maximum: 9.86
Zeros: 597536
Zeros (%): 76.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.986
Maximum: 9.86
Range: 9.86
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3697095668
Coefficient of variation (CV): 2.040565769
Kurtosis: 8.481347092
Mean: 0.1811799318
Median Absolute Deviation (MAD): 0
Skewness: 2.417459196
Sum: 142516.1343
Variance: 0.1366851638
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
0                       597536    76.0%
0.493                   70617     9.0%
0.986                   35735     4.5%
0.7395                  34790     4.4%
0.2465                  7664      1.0%
1.2325                  7164      0.9%
1.479                   6768      0.9%
1.4297                  5078      0.6%
0.46835                 3097      0.4%
0.4437                  2657      0.3%
Other values (88)       15494     2.0%

Minimum 5 values

Value                   Count     Frequency (%)
0                       597536    76.0%
0.02465                 10        < 0.1%
0.0493                  3         < 0.1%
0.0986                  4         < 0.1%
0.1479                  303       < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
9.86                    1         < 0.1%
7.395                   1         < 0.1%
6.6555                  1         < 0.1%
6.409                   1         < 0.1%
5.916                   1         < 0.1%
 

amount_paid
Real number (ℝ≥0)

Distinct: 6471
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.18327131
Minimum: 0
Maximum: 1131.03
Zeros: 872
Zeros (%): 0.1%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5135
Q1: 6.64812
median: 9.027
Q3: 12.213
95-th percentile: 19.5408
Maximum: 1131.03
Range: 1131.03
Interquartile range (IQR): 5.56488

Descriptive statistics

Standard deviation: 5.6181212
Coefficient of variation (CV): 0.5517010233
Kurtosis: 2243.912588
Mean: 10.18327131
Median Absolute Deviation (MAD): 2.655
Skewness: 15.5881411
Sum: 8010161.21
Variance: 31.56328582
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
5.31                    14667     1.9%
7.965                   14410     1.8%
6.372                   11878     1.5%
8.496                   10350     1.3%
6.903                   9988      1.3%
5.841                   9734      1.2%
9.027                   9213      1.2%
7.434                   9156      1.2%
10.62                   8982      1.1%
9.558                   8377      1.1%
Other values (6461)     679845    86.4%

Minimum 5 values

Value                   Count     Frequency (%)
0                       872       0.1%
0.00531                 1         < 0.1%
0.01593                 1         < 0.1%
0.02655                 1         < 0.1%
0.03717                 1         < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
1131.03                 1         < 0.1%
581.7105                1         < 0.1%
363.01815               1         < 0.1%
353.3805                1         < 0.1%
246.88845               1         < 0.1%
 

restaurant_id
Real number (ℝ≥0)

Distinct: 13569
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162864079.3
Minimum: 73498
Maximum: 340453498
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 73498
5-th percentile: 29803498
Q1: 86023498
median: 169613498
Q3: 228433498
95-th percentile: 302393498
Maximum: 340453498
Range: 340380000
Interquartile range (IQR): 142410000

Descriptive statistics

Standard deviation: 87830821.23
Coefficient of variation (CV): 0.5392890906
Kurtosis: -1.08595334
Mean: 162864079.3
Median Absolute Deviation (MAD): 71240000
Skewness: -0.02254910338
Sum: 1.281088848e+14
Variance: 7.714253157e+15
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
37623498                1317      0.2%
983498                  1071      0.1%
192673498               1031      0.1%
154543498               999       0.1%
88773498                967       0.1%
146723498               942       0.1%
105253498               935       0.1%
18603498                922       0.1%
30633498                918       0.1%
29593498                882       0.1%
Other values (13559)    776616    98.7%

Minimum 5 values

Value                   Count     Frequency (%)
73498                   120       < 0.1%
123498                  37        < 0.1%
153498                  193       < 0.1%
173498                  181       < 0.1%
193498                  84        < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
340453498               1         < 0.1%
340093498               2         < 0.1%
340033498               1         < 0.1%
339983498               2         < 0.1%
339913498               1         < 0.1%
 

city_id
Real number (ℝ≥0)

Distinct: 3749
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47179.7505
Minimum: 230
Maximum: 100205
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 230
5-th percentile: 10346
Q1: 24799
median: 46467
Q3: 67886
95-th percentile: 89749
Maximum: 100205
Range: 99975
Interquartile range (IQR): 43087

Descriptive statistics

Standard deviation: 25904.63056
Coefficient of variation (CV): 0.5490624747
Kurtosis: -1.018564164
Mean: 47179.7505
Median Absolute Deviation (MAD): 21419
Skewness: 0.05185593619
Sum: 3.711159174e+10
Variance: 671049884.7
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count     Frequency (%)
10346                   86654     11.0%
20326                   36210     4.6%
80562                   34100     4.3%
50898                   21627     2.7%
40441                   16732     2.1%
60537                   14760     1.9%
44366                   14119     1.8%
45358                   11246     1.4%
4334                    11106     1.4%
90633                   10449     1.3%
Other values (3739)     529597    67.3%

Minimum 5 values

Value                   Count     Frequency (%)
230                     993       0.1%
1298                    6519      0.8%
1676                    77        < 0.1%
1685                    33        < 0.1%
1689                    18        < 0.1%

Maximum 5 values

Value                   Count     Frequency (%)
100205                  1         < 0.1%
100079                  1         < 0.1%
100061                  3         < 0.1%
100048                  56        < 0.1%
99999                   5         < 0.1%
 

payment_id
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1668.509077
Minimum: 1491
Maximum: 1811
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1491
5-th percentile: 1523
Q1: 1619
median: 1619
Q3: 1779
95-th percentile: 1779
Maximum: 1811
Range: 320
Interquartile range (IQR): 160

Descriptive statistics

Standard deviation: 87.19266546
Coefficient of variation (CV): 0.05225783105
Kurtosis: -1.011622604
Mean: 1668.509077
Median Absolute Deviation (MAD): 0
Skewness: 0.2658271582
Sum: 1312449240
Variance: 7602.56091
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=5)]

Common values

Value                   Count     Frequency (%)
1619                    476600    60.6%
1779                    234133    29.8%
1491                    36497     4.6%
1811                    34492     4.4%
1523                    4878      0.6%

Minimum 5 values

Value                   Count     Frequency (%)
1491                    36497     4.6%
1523                    4878      0.6%
1619                    476600    60.6%
1779                    234133    29.8%
1811                    34492     4.4%

Maximum 5 values

Value                   Count     Frequency (%)
1811                    34492     4.4%
1779                    234133    29.8%
1619                    476600    60.6%
1523                    4878      0.6%
1491                    36497     4.6%
 

platform_id
Real number (ℝ≥0)

SKEWED

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29868.52938
Minimum: 525
Maximum: 30423
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 525
5-th percentile: 29463
Q1: 29463
median: 29815
Q3: 30231
95-th percentile: 30359
Maximum: 30423
Range: 29898
Interquartile range (IQR): 768

Descriptive statistics

Standard deviation: 1160.893265
Coefficient of variation (CV): 0.03886677012
Kurtosis: 565.3036862
Mean: 29868.52938
Median Absolute Deviation (MAD): 352
Skewness: -22.53663783
Sum: 2.349458521e+10
Variance: 1347673.174
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=14)]

Common values

Value                   Count     Frequency (%)
29463                   241523    30.7%
30231                   216726    27.6%
29815                   158972    20.2%
30359                   103653    13.2%
30391                   24434     3.1%
29751                   19321     2.5%
29495                   11151     1.4%
30423                   6819      0.9%
30199                   2079      0.3%
525                     1094      0.1%
Other values (4)        828       0.1%

Minimum 5 values

Value                   Count     Frequency (%)
525                     1094      0.1%
22167                   3         < 0.1%
22263                   232       < 0.1%
22295                   1         < 0.1%
29463                   241523    30.7%

Maximum 5 values

Value                   Count     Frequency (%)
30423                   6819      0.9%
30391                   24434     3.1%
30359                   103653    13.2%
30231                   216726    27.6%
30199                   2079      0.3%
 

transmission_id
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4253.246112
Minimum: 212
Maximum: 21124
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 212
5-th percentile: 4228
Q1: 4228
median: 4324
Q3: 4356
95-th percentile: 4356
Maximum: 21124
Range: 20912
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 572.8556657
Coefficient of variation (CV): 0.1346866959
Kurtosis: 176.6261099
Mean: 4253.246112
Median Absolute Deviation (MAD): 32
Skewness: -0.9114324558
Sum: 3345603392
Variance: 328163.6137
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=10)]

Common values

Value                   Count     Frequency (%)
4356                    341734    43.4%
4324                    203668    25.9%
4228                    201617    25.6%
4260                    14538     1.8%
212                     12676     1.6%
4996                    6737      0.9%
4196                    5276      0.7%
1988                    207       < 0.1%
21124                   146       < 0.1%
2020                    1         < 0.1%

Minimum 5 values

Value                   Count     Frequency (%)
212                     12676     1.6%
1988                    207       < 0.1%
2020                    1         < 0.1%
4196                    5276      0.7%
4228                    201617    25.6%

Maximum 5 values

Value                   Count     Frequency (%)
21124                   146       < 0.1%
4996                    6737      0.9%
4356                    341734    43.4%
4324                    203668    25.9%
4260                    14538     1.8%
 

is_returning_customer
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                   Count     Frequency (%)
1                       408889    52.0%
0                       377711    48.0%

Interactions

[Figures: 100 interaction plots, not reproduced here]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
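
As an illustration of the calculation just described, here is a minimal pure-Python sketch. The function name `pearson_r` and the sample data are made up for this example; this is not the code the report itself runs.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r, r_rescaled)
```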

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
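
The rank-based calculation can be sketched in pure Python as follows. The helper names (`average_ranks`, `spearman_rho`) are invented for this example, and assigning tied values the average of their positions is an assumption about tie handling that the text above does not spell out.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Covariance of the rank variables over the product of their std deviations."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)

# A monotonic but nonlinear relationship: y = x**3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values matters, the cubic relationship above is scored as a perfect monotonic association.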

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
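
A direct, quadratic-time sketch of this pair-counting definition is given below. It implements the plain formula stated above (often called tau-a); the function name `kendall_tau_a` is made up for the example, and tie-corrected variants that statistical libraries may apply are deliberately left out.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 is a tie in x or y and counts as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)  # 8 concordant and 2 discordant pairs out of 10
print(tau)
```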

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

[Figures: 3 missing-value plots, not reproduced here]

Sample

First rows

        customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
0       000097eabfd9  2015-06-20  19          1.0                  0          0.0             0.000         11.46960     5803498        20326    1779        30231        4356             0
1       0000e2c6d9be  2016-01-29  20          1.0                  0          0.0             0.000         9.55800      239303498      76547    1619        30359        4356             0
2       000133bb597f  2017-02-26  19          1.0                  0          0.0             0.493         5.93658      206463498      33833    1619        30359        4324             1
3       00018269939b  2017-02-05  17          1.0                  0          0.0             0.493         9.82350      36613498       99315    1619        30359        4356             0
4       0001a00468a6  2015-08-04  19          1.0                  0          0.0             0.493         5.15070      225853498      16456    1619        29463        4356             0
5       0001d9036b5e  2015-08-29  19          1.0                  0          0.0             0.000         11.94750     193643498      88276    1619        29463        4356             0
6       0001d9036b5e  2017-01-04  17          2.0                  0          0.0             0.000         11.15100     193643498      88276    1619        29463        4356             0
7       0001d9036b5e  2017-01-28  16          3.0                  0          0.0             0.000         9.71730      193643498      88276    1619        30359        4356             0
8       0001e1e04d7d  2015-10-24  19          1.0                  0          0.0             0.000         25.22250     144833498      45358    1619        29463        4356             1
9       0001e1e04d7d  2016-03-24  19          2.0                  0          0.0             0.000         9.29250      95953498       45358    1619        29463        4324             1

Last rows

        customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
786590  fffcf45e5c69  2016-11-19  12          1.0                  0          0.0             0.0000        12.53160     107463498      39335    1619        29463        4356             0
786591  fffcf45e5c69  2017-02-04  12          2.0                  0          0.0             0.0000        11.57580     107463498      39335    1619        30359        4356             0
786592  fffd696eaedd  2015-09-14  12          1.0                  0          0.0             1.4297        24.13395     95323498       80562    1779        29463        4356             0
786593  fffe9d5a8d41  2016-07-31  21          NaN                  1          0.0             0.0000        8.44290      156133498      10346    1811        29463        212              1
786594  fffe9d5a8d41  2016-09-30  20          1.0                  0          0.0             0.0000        10.72620     983498         10346    1779        29463        4228             1
786595  fffe9d5a8d41  2016-09-30  20          NaN                  1          0.0             0.0000        10.72620     983498         10346    1779        29463        212              1
786596  ffff347c3cfa  2016-08-17  21          1.0                  0          0.0             0.0000        7.59330      52893498       41978    1619        30359        4356             1
786597  ffff347c3cfa  2016-09-15  21          2.0                  0          0.0             0.0000        5.94720      164653498      41978    1619        30359        4356             1
786598  ffff4519b52d  2016-04-02  19          1.0                  0          0.0             0.0000        21.77100     16363498       80562    1491        29751        4228             0
786599  ffffccbfc8a4  2015-05-30  20          1.0                  0          0.0             0.0000        16.46100     150293498      45952    1619        29463        4324             0